Machine-printed and hand-written text lines identification
Internal identifier: 001A32 (Main/Exploration); previous: 001A31; next: 001A33
Authors: U. Pal [India]; Bidyut Baran Chaudhuri [India]
Source:
- Pattern Recognition Letters [0167-8655]; 2000.
Abstract
There are many types of documents where machine-printed and hand-written texts intermixedly appear. Since the optical character recognition (OCR) methodologies for machine-printed and hand-written texts are different, to achieve optimal performance it is necessary to separate these two types of texts before feeding them to their respective OCR systems. In this paper, we present a machine-printed and hand-written text classification scheme for Bangla and Devnagari, the two most popular Indian scripts. The scheme is based on the structural and statistical features of the machine-printed and hand-written text lines. The classification scheme has an accuracy of 98.6%.
DOI: 10.1016/S0167-8655(00)00126-4
Links to previous steps (curation, corpus...)
- to stream Istex, to step Corpus: 000601
- to stream Istex, to step Curation: 000593
- to stream Istex, to step Checkpoint: 001086
- to stream Main, to step Merge: 001B25
- to stream Main, to step Curation: 001A32
The document in XML format
<record><TEI wicri:istexFullTextTei="biblStruct"><teiHeader><fileDesc><titleStmt><title>Machine-printed and hand-written text lines identification</title>
<author><name sortKey="Pal, U" sort="Pal, U" uniqKey="Pal U" first="U." last="Pal">U. Pal</name>
</author>
<author><name sortKey="Chaudhuri, B B" sort="Chaudhuri, B B" uniqKey="Chaudhuri B" first="B. B." last="Chaudhuri">Bidyut Baran Chaudhuri</name>
<affiliation><country>Inde</country>
<placeName><settlement type="city">Calcutta</settlement>
<region type="province">Bengale-Occidental</region>
</placeName>
<orgName type="lab" n="5">Institut indien de statistiques</orgName>
</affiliation>
</author>
</titleStmt>
<publicationStmt><idno type="wicri:source">ISTEX</idno>
<idno type="RBID">ISTEX:49C2E2049B28B4ABE1CDE28ABF0CA9DF3074F0E9</idno>
<date when="2001" year="2001">2001</date>
<idno type="doi">10.1016/S0167-8655(00)00126-4</idno>
<idno type="url">https://api.istex.fr/document/49C2E2049B28B4ABE1CDE28ABF0CA9DF3074F0E9/fulltext/pdf</idno>
<idno type="wicri:Area/Istex/Corpus">000601</idno>
<idno type="wicri:Area/Istex/Curation">000593</idno>
<idno type="wicri:Area/Istex/Checkpoint">001086</idno>
<idno type="wicri:doubleKey">0167-8655:2001:Pal U:machine:printed:and</idno>
<idno type="wicri:Area/Main/Merge">001B25</idno>
<idno type="wicri:Area/Main/Curation">001A32</idno>
<idno type="wicri:Area/Main/Exploration">001A32</idno>
</publicationStmt>
<sourceDesc><biblStruct><analytic><title level="a">Machine-printed and hand-written text lines identification</title>
<author><name sortKey="Pal, U" sort="Pal, U" uniqKey="Pal U" first="U." last="Pal">U. Pal</name>
<affiliation wicri:level="1"><country xml:lang="fr">Inde</country>
<wicri:regionArea>Computer Vision and Pattern Recognition Unit, Indian Statistical Institute, 203 B.T. Road, Calcutta 700 035</wicri:regionArea>
<wicri:noRegion>Calcutta 700 035</wicri:noRegion>
</affiliation>
<affiliation wicri:level="1"><country wicri:rule="url">Inde</country>
</affiliation>
</author>
<author><name sortKey="Chaudhuri, B B" sort="Chaudhuri, B B" uniqKey="Chaudhuri B" first="B. B." last="Chaudhuri">Bidyut Baran Chaudhuri</name>
<affiliation wicri:level="1"><country xml:lang="fr">Inde</country>
<wicri:regionArea>Computer Vision and Pattern Recognition Unit, Indian Statistical Institute, 203 B.T. Road, Calcutta 700 035</wicri:regionArea>
<wicri:noRegion>Calcutta 700 035</wicri:noRegion>
<placeName><settlement type="city">Calcutta</settlement>
<region type="province">Bengale-Occidental</region>
</placeName>
<orgName type="lab" n="5">Institut indien de statistiques</orgName>
</affiliation>
<affiliation wicri:level="1"><country wicri:rule="url">Inde</country>
<placeName><settlement type="city">Calcutta</settlement>
<region type="province">Bengale-Occidental</region>
</placeName>
<orgName type="lab" n="5">Institut indien de statistiques</orgName>
</affiliation>
</author>
</analytic>
<monogr></monogr>
<series><title level="j">Pattern Recognition Letters</title>
<title level="j" type="abbrev">PATREC</title>
<idno type="ISSN">0167-8655</idno>
<imprint><publisher>ELSEVIER</publisher>
<date type="published" when="2000">2000</date>
<biblScope unit="volume">22</biblScope>
<biblScope unit="issue">3–4</biblScope>
<biblScope unit="page" from="431">431</biblScope>
<biblScope unit="page" to="441">441</biblScope>
</imprint>
<idno type="ISSN">0167-8655</idno>
</series>
<idno type="istex">49C2E2049B28B4ABE1CDE28ABF0CA9DF3074F0E9</idno>
<idno type="DOI">10.1016/S0167-8655(00)00126-4</idno>
<idno type="PII">S0167-8655(00)00126-4</idno>
</biblStruct>
</sourceDesc>
<seriesStmt><idno type="ISSN">0167-8655</idno>
</seriesStmt>
</fileDesc>
<profileDesc><textClass></textClass>
<langUsage><language ident="en">en</language>
</langUsage>
</profileDesc>
</teiHeader>
<front><div type="abstract" xml:lang="en">There are many types of documents where machine-printed and hand-written texts intermixedly appear. Since the optical character recognition (OCR) methodologies for machine-printed and hand-written texts are different, to achieve optimal performance it is necessary to separate these two types of texts before feeding them to their respective OCR systems. In this paper, we present a machine-printed and hand-written text classification scheme for Bangla and Devnagari, the two most popular Indian scripts. The scheme is based on the structural and statistical features of the machine-printed and hand-written text lines. The classification scheme has an accuracy of 98.6%.</div>
</front>
</TEI>
<affiliations><list><country><li>Inde</li>
</country>
<region><li>Bengale-Occidental</li>
</region>
<settlement><li>Calcutta</li>
</settlement>
<orgName><li>Institut indien de statistiques</li>
</orgName>
</list>
<tree><country name="Inde"><noRegion><name sortKey="Pal, U" sort="Pal, U" uniqKey="Pal U" first="U." last="Pal">U. Pal</name>
</noRegion>
<name sortKey="Chaudhuri, B B" sort="Chaudhuri, B B" uniqKey="Chaudhuri B" first="B. B." last="Chaudhuri">Bidyut Baran Chaudhuri</name>
<name sortKey="Chaudhuri, B B" sort="Chaudhuri, B B" uniqKey="Chaudhuri B" first="B. B." last="Chaudhuri">Bidyut Baran Chaudhuri</name>
<name sortKey="Pal, U" sort="Pal, U" uniqKey="Pal U" first="U." last="Pal">U. Pal</name>
</country>
</tree>
</affiliations>
</record>
To manipulate this document under Unix (Dilib)
EXPLOR_STEP=$WICRI_ROOT/Ticri/CIDE/explor/OcrV1/Data/Main/Exploration
HfdSelect -h $EXPLOR_STEP/biblio.hfd -nk 001A32 | SxmlIndent | more
Or
HfdSelect -h $EXPLOR_AREA/Data/Main/Exploration/biblio.hfd -nk 001A32 | SxmlIndent | more
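Once a record has been selected, a single field can be pulled out of it with standard Unix tools. The sketch below is only an illustration: `extract_doi` is a hypothetical helper, not part of Dilib, and it assumes that each `<idno type="doi">` element sits on a single line, as it does in the record shown above. The match is case-sensitive, so the `<idno type="DOI">` element of the same record is not picked up.

```shell
# Hypothetical helper (not part of Dilib): print the content of every
# <idno type="doi"> element found on standard input.
extract_doi() {
    # grep -o keeps only the matching element; sed then strips the tags.
    grep -o '<idno type="doi">[^<]*</idno>' | sed -e 's/<[^>]*>//g'
}

# Example with a fragment copied from the record above:
printf '%s\n' '<idno type="doi">10.1016/S0167-8655(00)00126-4</idno>' | extract_doi
```

Under the same assumption, the helper could be placed at the end of the selection pipeline, for instance `HfdSelect -h $EXPLOR_STEP/biblio.hfd -nk 001A32 | extract_doi`.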
To add a link to this page in the Wicri network
{{Explor lien |wiki= Ticri/CIDE |area= OcrV1 |flux= Main |étape= Exploration |type= RBID |clé= ISTEX:49C2E2049B28B4ABE1CDE28ABF0CA9DF3074F0E9 |texte= Machine-printed and hand-written text lines identification }}
This area was generated with Dilib version V0.6.32.